
    GPU-Accelerated BWT Construction for Large Collection of Short Reads

    Advances in DNA sequencing technology have stimulated the development of algorithms and tools for processing very large collections of short strings (reads). Short-read alignment and assembly are among the most well-studied problems. Many state-of-the-art aligners, at their core, use the Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome (a typical example being the NCBI human genome). Recently, the BWT has also found use in string-graph assembly, for indexing the reads (i.e., the raw data from DNA sequencers). In a typical data set, the volume of reads is tens of times that of the sequenced genome and can be up to 100 Gigabases. Note that a reference genome is relatively stable, so computing its index is not a frequent task. For reads, however, the index has to be computed from scratch for each given input, and efficient BWT construction becomes a much bigger concern than before. In this paper, we present a practical method called CX1 for constructing the BWT of very large string collections. CX1 is the first tool that can take advantage of the parallelism given by a graphics processing unit (GPU, a relatively cheap device providing a thousand or more primitive cores), simultaneously with the parallelism of a multi-core CPU and, more interestingly, of a cluster of GPU-enabled nodes. Using CX1, the BWT of a short-read collection of up to 100 Gigabases can be constructed in less than 2 hours on a machine equipped with a quad-core CPU and a GPU, or in about 43 minutes using a cluster of 4 such machines (the speedup is almost linear after excluding the first 16 minutes for loading the reads from the hard disk). The previously fastest tool, BRC, is measured to take 12 hours to process 100 Gigabases on one machine; it is non-trivial how BRC could be parallelized to take advantage of a cluster of machines, let alone GPUs. Comment: 11 pages.
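
    The following is a minimal sketch, in plain Python, of what the BWT of a read collection is: concatenate the reads with sentinel characters, sort all suffixes, and record the character preceding each suffix. It is only meant to make the object being constructed concrete; it shares nothing with CX1's GPU/multi-core/cluster pipeline, and using one shared sentinel for every read is a simplification (a full multi-string BWT gives each read its own sentinel rank).

        # Naive BWT of a read collection -- illustration only, usable for tiny inputs.
        def bwt_of_reads(reads):
            # One shared '$' sentinel per read boundary (a simplification; real
            # multi-string BWTs rank each read's sentinel separately).
            text = "$".join(reads) + "$"
            n = len(text)
            # Sort suffix start positions lexicographically (quadratic-ish; demo only).
            order = sorted(range(n), key=lambda i: text[i:])
            # The BWT takes, for each suffix in sorted order, its (cyclic) preceding character.
            return "".join(text[i - 1] for i in order)

        print(bwt_of_reads(["ACGT", "ACGA", "TTGA"]))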

    Trade-offs between speed and processor in hard-deadline scheduling

    This paper revisits the problem of on-line scheduling of sequential jobs with hard deadlines in a preemptive, multiprocessor setting. An on-line scheduling algorithm is said to be optimal if it can schedule any set of jobs to meet their deadlines whenever this is feasible in the off-line sense. It is known that the earliest-deadline-first strategy (EDF) is optimal in a one-processor setting, and that there is no optimal on-line algorithm in an m-processor setting where m ≥ 2. Recent work, however, reveals that if the on-line algorithm is given faster processors, EDF is actually optimal for all m (e.g., when m = 2, it suffices to use processors 1.5 times as fast). This paper initiates the study of the trade-off between increasing the speed and using more processors in deriving optimal on-line scheduling algorithms. Several upper bound and lower bound results are presented. For example, the speed requirement of EDF can be reduced to 2 - (1+p)/(m+p) when it is given p ≥ 0 extra processors. The main result is a new on-line algorithm that demands less speedy processors to attain optimality (e.g., when m = 2, the speed requirement is 4/3) and admits a better speed-processor trade-off than EDF (e.g., when m = 2 and p = 1, the speed requirement is 1.2). In general, no optimal algorithm exists when the speed factor is less than 1/(2√(2+p/m) - 2).
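
    As a quick sanity check of the EDF trade-off stated above (taking the bound to be 2 - (1+p)/(m+p), which reduces to the known 2 - 1/m speed requirement of EDF when no extra processors are given):

        \[
          s_{\mathrm{EDF}}(m, p) = 2 - \frac{1+p}{m+p},
          \qquad s_{\mathrm{EDF}}(2, 0) = \tfrac{3}{2},
          \qquad s_{\mathrm{EDF}}(2, 1) = \tfrac{4}{3},
        \]

    matching the m = 2 figure of 1.5 quoted earlier, and showing how one extra processor lowers EDF's speed requirement to 4/3, which the new algorithm improves further to 1.2.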

    A Decomposition Theorem for Maximum Weight Bipartite Matchings

    Let G be a bipartite graph with positive integer weights on the edges and without isolated nodes. Let n, N and W be the node count, the largest edge weight and the total weight of G, respectively. Let k(x,y) = log(x)/log(x^2/y). We present a new decomposition theorem for maximum weight bipartite matchings and use it to design an O(sqrt(n)W/k(n,W/N))-time algorithm for computing a maximum weight matching of G. This algorithm bridges a long-standing gap between the best known time complexity of computing a maximum weight matching and that of computing a maximum cardinality matching. Given G and a maximum weight matching of G, we can further compute the weight of a maximum weight matching of G-{u} for all nodes u in O(W) time. Comment: The journal version will appear in SIAM Journal on Computing. The conference version appeared in ESA 1999.
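
    To see how this bound bridges the gap to cardinality matching, specialize it to unit weights (a worked check, assuming only the definitions above): then N = 1 and W equals the number of edges m, so

        \[
          O\!\left(\frac{\sqrt{n}\,W}{k(n, W/N)}\right)
          = O\!\left(\sqrt{n}\,W\,\frac{\log\!\big(n^{2}/(W/N)\big)}{\log n}\right)
          = O\!\left(\sqrt{n}\,m\,\frac{\log(n^{2}/m)}{\log n}\right),
        \]

    which coincides with the Feder-Motwani bound for maximum cardinality bipartite matching.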

    MEGAHIT: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph

    MEGAHIT is an NGS de novo assembler for assembling large and complex metagenomics data in a time- and cost-efficient manner. It finished assembling a soil metagenomics dataset of 252 Gbp in 44.1 hours and 99.6 hours on a single computing node with and without a GPU, respectively. MEGAHIT assembles the data as a whole, i.e., it avoids pre-processing such as partitioning and normalization, which might compromise the integrity of the results. MEGAHIT generates an assembly 3 times larger, with longer contig N50 and longer average contig length, than the previous assembly. 55.8% of the reads were aligned to the assembly, 4 times higher than for the previous assembly. The source code of MEGAHIT is freely available at https://github.com/voutcn/megahit under the GPLv3 license. Comment: 2 pages, 2 tables, 1 figure, submitted to Oxford Bioinformatics as an Application Note.
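
    As a concrete illustration of the underlying data structure (not MEGAHIT's implementation, which builds a succinct de Bruijn graph and can exploit a GPU), the sketch below constructs a plain de Bruijn graph from reads in Python: nodes are (k-1)-mers and every k-mer contributes an edge from its prefix to its suffix.

        # Plain (non-succinct) de Bruijn graph from reads -- illustration only.
        from collections import defaultdict

        def de_bruijn_graph(reads, k):
            graph = defaultdict(set)                 # (k-1)-mer -> set of successor (k-1)-mers
            for read in reads:
                for i in range(len(read) - k + 1):
                    kmer = read[i:i + k]
                    graph[kmer[:-1]].add(kmer[1:])   # edge: prefix -> suffix
            return graph

        g = de_bruijn_graph(["ACGTACG", "CGTACGT"], k=4)
        for node, succs in sorted(g.items()):
            print(node, "->", sorted(succs))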

    An Even Faster and More Unifying Algorithm for Comparing Trees via Unbalanced Bipartite Matchings

    A widely used method for determining the similarity of two labeled trees is to compute a maximum agreement subtree of the two trees. Previous work on this similarity measure considered only the comparison of labeled trees of two special kinds, namely, uniformly labeled trees (i.e., trees with all their nodes labeled by the same symbol) and evolutionary trees (i.e., leaf-labeled trees with distinct symbols for distinct leaves). This paper presents an algorithm for comparing trees that are labeled in an arbitrary manner. In addition to this generality, the algorithm is faster than the previous algorithms. Another contribution of this paper concerns maximum weight bipartite matchings. We show how to speed up the best known matching algorithms when the input graphs are node-unbalanced or weight-unbalanced. Based on these enhancements, we obtain an efficient algorithm for a new matching problem called the hierarchical bipartite matching problem, which is at the core of our maximum agreement subtree algorithm. Comment: To appear in Journal of Algorithms.
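
    To make the similarity measure itself concrete, here is a minimal sketch (plain Python, assuming rooted binary trees with distinct leaf labels, i.e., the evolutionary-tree special case) of the textbook recursion for the size of a maximum agreement subtree. It only illustrates the measure; it is not the faster, more general algorithm of the paper.

        # Size of a maximum agreement subtree of two rooted binary leaf-labeled trees.
        # A tree is either a leaf label (str) or a pair (left_subtree, right_subtree).
        from functools import lru_cache

        @lru_cache(maxsize=None)
        def mast(u, v):
            if isinstance(u, str) and isinstance(v, str):
                return 1 if u == v else 0
            best = []
            if not isinstance(u, str):                          # drop one side of u
                best += [mast(u[0], v), mast(u[1], v)]
            if not isinstance(v, str):                          # drop one side of v
                best += [mast(u, v[0]), mast(u, v[1])]
            if not isinstance(u, str) and not isinstance(v, str):
                best += [mast(u[0], v[0]) + mast(u[1], v[1]),   # pair up the subtrees
                         mast(u[0], v[1]) + mast(u[1], v[0])]
            return max(best)

        t1 = (("a", "b"), ("c", "d"))
        t2 = ((("a", "b"), "d"), "c")
        print(mast(t1, t2))  # 3: the leaves {a, b, c} (or {a, b, d}) induce the same topology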

    Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window

    The past decade has witnessed many interesting algorithms for maintaining statistics over a data stream. This paper initiates a theoretical study of algorithms for monitoring distributed data streams over a time-based sliding window (which contains a variable number of items, possibly arriving out of order). The concern is how to minimize the communication between the individual streams and the root, while allowing the root, at any time, to report the global statistics of all streams within a given error bound. This paper presents communication-efficient algorithms for three classical statistics, namely, basic counting, frequent items and quantiles. The worst-case communication cost over a window is O((k/ε) log(εN/k)) bits for basic counting and O((k/ε) log(N/k)) words for the other two, where k is the number of distributed data streams, N is the total number of items in the streams that arrive or expire in the window, and ε < 1 is the desired error bound. Matching and nearly matching lower bounds are also obtained. Comment: 12 pages, to appear in the 27th International Symposium on Theoretical Aspects of Computer Science (STACS), 2010.
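
    For intuition about how a site can keep communication low, here is a heavily simplified, single-stream Python sketch: the site sends an update to the root only when its in-window count has drifted by more than a (1 + ε) factor since the last report, so the root always knows the count up to that relative error. This is only meant to illustrate the flavour of such protocols; it is not the paper's algorithm, ignores out-of-order items, and comes with no communication bound.

        from collections import deque

        class SlidingCounterSite:
            """One stream site counting items in a time-based sliding window."""
            def __init__(self, window, eps):
                self.window, self.eps = window, eps
                self.events = deque()    # timestamps of items currently in the window
                self.reported = 0        # count last sent to the root

            def observe(self, now, new_item=True):
                if new_item:
                    self.events.append(now)
                while self.events and self.events[0] <= now - self.window:
                    self.events.popleft()            # expire items that left the window
                count = len(self.events)
                lo = self.reported / (1 + self.eps)
                hi = self.reported * (1 + self.eps)
                if not (lo <= count <= hi):
                    self.reported = count
                    return ("update", count)         # the only message sent to the root
                return None

        site = SlidingCounterSite(window=10, eps=0.1)
        for t in range(30):
            msg = site.observe(t)
            if msg:
                print("time", t, "->", msg)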

    Cavity Matchings, Label Compressions, and Unrooted Evolutionary Trees

    We present an algorithm for computing a maximum agreement subtree of two unrooted evolutionary trees. It takes O(n^{1.5} log n) time for trees with unbounded degrees, matching the best known time complexity for the rooted case. Our algorithm allows the input trees to be mixed trees, i.e., trees that may contain directed and undirected edges at the same time. Our algorithm adopts a recursive strategy exploiting a technique called label compression. The backbone of this technique is an algorithm that computes the maximum weight matchings over many subgraphs of a bipartite graph as fast as it takes to compute a single matching.

    Efficient parallel algorithms for some subsequence problems
